SUB-BAND SPEECH CODING SYSTEM



Field of Invention 

This invention relates to a speech coder based on code excited linear prediction (CELP)
coding and, more particularly, to a sub-band speech coder.



Background of Invention 

Speech compression is a fundamental part of digital communication systems. In a
traditional telephone network, the speech signal is a narrow band signal that is band limited to 4
kHz. Many of the new emerging applications do not require the speech bandwidth to be limited.
Hence, wideband signals with a signal bandwidth of 50 to 7,000 Hz, resulting in a higher
perceived quality, are rapidly becoming more attractive for new applications such as voice over
Internet Protocol or third generation wireless services. Consequently, digital coding of
wideband speech is becoming increasingly important.

Code-Excited Linear Prediction (CELP) is a well-known class of speech coding
algorithms with good performance at low to medium bit rates (4 to 16 kb/s) for narrow band
speech. See B.S. Atal and Schroeder's article entitled "Stochastic Coding of Speech Signals
at Very Low Bit Rates," IEEE International Conference on Acoustics, Speech and Signal
Processing, May 1984. For wide band speech, the same algorithm can be used over the entire
input bandwidth with some degree of success. Alternatively, the input signal can be decomposed
into two or more sub-bands which are coded independently. In these sub-band coders the signal
is downsampled, coded, and upsampled again. In traditional sub-band coders, the signal is
critically subsampled. The anti-aliasing filters used in practical applications have non-zero
transition bands and therefore introduce some leakage between the bands, which sometimes
causes audible aliasing distortions. Quadrature Mirror Filters (QMF), in which the aliasing is
cancelled out during resynthesis, can be used in the case of equal sub-band decomposition. In
the general case of unequal sub-bands, critical subsampling introduces aliasing.
Summary of Invention

In accordance with one embodiment of the present invention, a wideband coder is
provided wherein the bandwidth is subdivided into sub-bands which may be unequal. The lower
sub-band is downsampled and encoded using a CELP coder. A higher sub-band is not
downsampled, but is computed over the entire frequency range and then band-pass filtered to
complement the lower band.

Description of the Drawings 

FIG. 1 is a block diagram of the coding system according to one embodiment of the
present invention;

FIG. 2 is a block diagram of a random noise generator decoder;

FIG. 3 is a block diagram of a gain-excited LPC decoder;

FIG. 4 is a block diagram of a gain-matched analysis-by-synthesis decoder; and

FIG. 5 is a block diagram of a pulse excitation decoder.


Description of Preferred Embodiment of the Present Invention 

Referring to FIG. 1, there is illustrated a sub-band coder system according to one
embodiment of the present invention. CELP coders operate on fixed-length segments of the
input called frames. The coder comprises an encoder/decoder pair. The encoder processes each
frame of speech by computing a set of parameters which it codes and transmits to a decoder.
The decoder receives this information and synthesizes an approximation to the input speech,
called coded speech.

The input speech is sampled at a sampling frequency fs (16 kHz for example) at A/D (analog
to digital) converter 11 and has a signal bandwidth of fs/2 (8 kHz). For coding purposes, this
bandwidth is sub-divided into two, possibly unequal, sub-bands. For example, consider a
wideband speech coder operating at 16 kHz with a useful signal bandwidth of 50 to 7,000 Hz. A
reasonable low-band bandwidth could be 0 to 5.33 kHz (illustrated in FIG. 2), obtained by
upsampling by 2 (2fs) at upsampler 13 (32 kHz), low-pass filtering with a lowpass filter 15 with
a transition band between, for example, 5 and 5.33 kHz, and downsampling by 3 (2fs/3) at
downsampler 17, resulting in a 10.67 kHz sampled low band signal. The downsampled (10.67
kHz) lower-band signal is encoded using a CELP coder 18. The low-band parameters from the
LPC coder comprise linear prediction (LPC) coefficients, which specify a time-varying all-pole
filter (LPC filter), and excitation parameters. The excitation parameters specify a time-domain
waveform called the excitation signal, which comprises adaptive and fixed excitation
contributions and corresponding gain factors (gain, LPC, adaptive codebook index and fixed
codebook index).
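
The low-band rate conversion described above (upsample by 2, low-pass filter, downsample by 3) can be sketched as follows. This is only an illustrative sketch: the Hamming-windowed sinc design, the tap count of 321, the 5,165 Hz cutoff (the midpoint of the 5 to 5.33 kHz transition band) and the function names lowpass_taps and resample are assumptions made for the example, not details taken from the embodiment.

```python
import math
from fractions import Fraction

def lowpass_taps(cutoff_hz, rate_hz, num_taps=321):
    """Hamming-windowed sinc low-pass FIR, normalized to unity DC gain."""
    mid = (num_taps - 1) / 2
    fc = cutoff_hz / rate_hz  # cutoff in cycles per sample
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]

def resample(signal, up, down, taps):
    """Zero-stuff by `up`, FIR filter, then keep every `down`-th sample."""
    stuffed = []
    for s in signal:
        stuffed.append(s * up)            # gain `up` compensates for the inserted zeros
        stuffed.extend([0.0] * (up - 1))
    out = []
    for i in range(0, len(stuffed), down):
        acc = 0.0
        for k, h in enumerate(taps):
            if 0 <= i - k < len(stuffed):
                acc += h * stuffed[i - k]
        out.append(acc)
    return out

FS = 16000                                # example input sampling rate fs
low_rate = Fraction(FS) * 2 / 3           # 2fs/3, about 10.67 kHz
low_band_filter = lowpass_taps(5165.0, 2 * FS)  # designed at the 32 kHz intermediate rate
```

Calling resample(frame, 2, 3, low_band_filter) on a block of 16 kHz samples yields the 10.67 kHz low-band signal. A practical coder would more likely use a polyphase structure that skips the multiplications by zero.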

The high-band signal is obtained from the original by simply band-pass or highpass
filtering it before applying it to a highband coder 20. An appropriate bandwidth can be between
fs/3 and fs/2, such as 5.33 and 7 kHz. The 16 kHz input, for the example, is band-pass filtered between
5.33 kHz and 7 kHz to obtain the high-band signal. The transition band of this filter would have
to be between 5 and 5.33 kHz and designed to complement the low-band low-pass filter. The
bandpass filtered output is coded in a highband coder 20. There are several possible ways to
implement the high-band excitation coder 20, such as random noise, noise excited LPC,
gain-matched analysis-by-synthesis, multi-pulse coding or a combination.
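
The high-band filter operates at the original sampling rate, with no rate conversion. One possible way to obtain such a band-pass response, shown here only as a sketch, is to subtract two windowed-sinc low-pass filters. The design method, the 161-tap length, the 5,165 Hz lower edge and the helper names windowed_sinc, bandpass_taps and gain_at are assumptions made for illustration.

```python
import cmath
import math

def windowed_sinc(cutoff_hz, rate_hz, num_taps):
    """Hamming-windowed sinc low-pass FIR with roughly unity passband gain."""
    mid = (num_taps - 1) / 2
    fc = cutoff_hz / rate_hz
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def bandpass_taps(low_hz, high_hz, rate_hz, num_taps=161):
    """Band-pass FIR formed as (low-pass at high_hz) minus (low-pass at low_hz)."""
    upper = windowed_sinc(high_hz, rate_hz, num_taps)
    lower = windowed_sinc(low_hz, rate_hz, num_taps)
    return [u - l for u, l in zip(upper, lower)]

def gain_at(taps, freq_hz, rate_hz):
    """Magnitude of the FIR frequency response at freq_hz."""
    w = 2 * math.pi * freq_hz / rate_hz
    return abs(sum(h * cmath.exp(-1j * w * n) for n, h in enumerate(taps)))

FS = 16000
high_band_filter = bandpass_taps(5165.0, 7000.0, FS)  # applied at 16 kHz, no resampling
```

Evaluating gain_at(high_band_filter, 6000.0, FS) gives a value close to one, while frequencies well inside the low band are strongly attenuated.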

The encoded signal is transmitted to the decoder via a transmission medium such as a
cable or wireless network. At the decoder, the lowband excitation signal is reconstructed at the
low band rate of 10.67 kHz (2fs/3) and this is applied to the CELP decoder (LPC synthesis filter)
21. The output of the CELP decoder 21 is upsampled at upsampler 23 (upsampled by 3) to 2fs
(32 kHz), low-pass filtered at filter 25 at 5.33 kHz, and downsampled by downsampler 26
(downsampled by 2) to 16 kHz to form the low-band coded signal. The high band signal at
fs (16 kHz) is generated at highband decoder 27 at the original sampling rate and bandpass
filtered at bandpass filter 29 to obtain the (16 kHz) high-band coded signal. The 16 kHz signal
is bandpass filtered between 5.33 kHz and 8 kHz to obtain the high band signal. The transition
band of this filter is between 5 and 5.33 kHz and designed to complement the low-band low-pass
filter. The high- and low-band contributions are added at adder 30 to obtain the coded speech
signal.
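
The sampling rates along the low-band path, from encoder input to decoder output, can be checked with a few lines of exact arithmetic. The helper name rate_chain is hypothetical; the factors are those of the 16 kHz example above.

```python
from fractions import Fraction

def rate_chain(fs_hz, stages):
    """Return the sampling rate after each (up, down) resampling stage."""
    rates = [Fraction(fs_hz)]
    for up, down in stages:
        rates.append(rates[-1] * up / down)
    return rates

# Encoder low band: up by 2, then down by 3.
# Decoder low band: up by 3, then down by 2.
rates = rate_chain(16000, [(2, 1), (1, 3), (3, 1), (1, 2)])
for r in rates:
    print(f"{float(r) / 1000:.2f} kHz")
```

The chain passes through 32 kHz and 10.67 kHz (exactly 32000/3 Hz) and returns to 16 kHz, so the decoded low band can be added sample by sample to the high band, which never leaves the 16 kHz rate.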

As discussed above, there are several high-band excitation coding methods.
The simplest model is a gain-scaled random noise generator as illustrated in FIG. 2. In
this case, the bits represent a quantized gain value, which is used as a scale factor. The random
noise generator 31 output is multiplied at multiplier 32 by this scale factor and bandpass filtered
at filter 35 to approximate the high-band signal. A second highband decoder is illustrated in
FIG. 3 where, after the noise generator 37 and gain multiplier 38 controlled by the gain value of a
lookup table accessed by the input bits, the resulting signal is passed through an LPC synthesis
filter 39 (different from the one used in the low band) controlled by the input bits. The order of
this filter and the size of the LPC synthesis filter codebook can be small. The intent is to apply
some frequency shaping to the high-band noise. The output is filtered by bandpass filter 40.
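
The decoders of FIG. 2 and FIG. 3 can be sketched as a gain lookup, a noise generator and an optional all-pole synthesis filter. The sketch below makes several assumptions that are not in the description: the noise is Gaussian, GAIN_TABLE is a hypothetical 2-bit gain codebook, the LPC coefficients follow the convention A(z) = 1 + a1 z^-1 + ... so that the filter is 1/A(z), and the final band-pass stage is omitted.

```python
import random

GAIN_TABLE = [0.05, 0.1, 0.2, 0.4]   # hypothetical gain codebook indexed by the received bits

def scaled_noise(num_samples, gain, seed=0):
    """Gain-scaled random noise (Gaussian is an assumption of this sketch)."""
    rng = random.Random(seed)
    return [gain * rng.gauss(0.0, 1.0) for _ in range(num_samples)]

def lpc_synthesis(excitation, lpc):
    """All-pole filter 1/A(z) with A(z) = 1 + sum_k lpc[k-1] * z^-k."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out

def decode_high_band(gain_index, lpc, num_samples, seed=0):
    """FIG. 3 style decoder; an empty `lpc` list reduces it to the FIG. 2 model."""
    noise = scaled_noise(num_samples, GAIN_TABLE[gain_index], seed)
    return lpc_synthesis(noise, lpc)
```

A low-order coefficient list such as [-0.5] is enough to tilt the noise spectrum, which is in keeping with the remark that the order of this filter can be small.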

In the gain-matched analysis-by-synthesis decoder illustrated in FIG. 4, the random noise
generator is replaced by a codebook 41 containing allowable excitation vectors accessed by the
input bits. The excitation vector which minimizes the error between the synthetic signal and the
input, under the constraint that the output gain matches the input gain, is selected. The selected
vectors are scaled or gain controlled at multiplier 43 by input bits and the resulting output is
applied through LPC synthesis filter 45 controlled by the input bits. The LPC synthesis filter 45
output is applied to bandpass filter 47. This is explained in more detail by E. Paksoy, A. McCree
and V. Viswanathan in "A Variable-Rate Multimodal Speech Coder With Gain-Matched Analysis
by Synthesis," IEEE International Conference on Acoustics, Speech and Signal Processing, April,
1997.
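
A minimal encoder-side sketch of such a search is given below. It reads the gain constraint as an energy match: each candidate is synthesized, scaled so that its energy equals that of the target, and the candidate with the smallest squared error is kept. This reading, the unquantized gain, and the names synthesize, energy and gain_matched_search are assumptions of the sketch and may differ from the method of the cited paper.

```python
import math

def synthesize(excitation, lpc):
    """All-pole filter 1/A(z) with A(z) = 1 + sum_k lpc[k-1] * z^-k."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out

def energy(x):
    return sum(v * v for v in x)

def gain_matched_search(target, codebook, lpc):
    """Return (index, gain) of the codebook vector with least error at matched energy."""
    best = None
    target_energy = energy(target)
    for index, vector in enumerate(codebook):
        synth = synthesize(vector, lpc)
        synth_energy = energy(synth)
        if synth_energy == 0.0:
            continue                      # an all-zero candidate cannot match the gain
        gain = math.sqrt(target_energy / synth_energy)
        error = sum((t - gain * s) ** 2 for t, s in zip(target, synth))
        if best is None or error < best[0]:
            best = (error, index, gain)
    if best is None:
        raise ValueError("codebook has no usable vectors")
    return best[1], best[2]
```

For a target of [0, 3, 0, 0], a codebook of [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 0]] and no LPC shaping, the search selects index 1 with a gain of 3.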

Another possibility is to use simple ternary pulse coding as illustrated in FIG. 5 in the
high band, where the highband signal is approximated by a waveform (generated at pulse
excitation generator 51) which consists of mostly zero elements, save for a few that have an
amplitude of +1 or -1. This excitation waveform is gain-scaled at multiplier 53 and filtered
through an LPC synthesis filter 55 and the highband band-pass filter 56 to produce the coded
high-band signal. The search for the excitation and gain is done through an analysis-by-synthesis
mechanism common in CELP coders. The high band coder 20 performs the
complement of the decoding.
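
The decoder side of the ternary pulse scheme can be sketched as follows. The representation of the pulses as separate position and sign lists, the function names, and the omission of the band-pass filter 56 are assumptions of this sketch.

```python
def ternary_excitation(length, positions, signs):
    """Build a mostly-zero vector with +1 or -1 at the given positions."""
    excitation = [0.0] * length
    for pos, sign in zip(positions, signs):
        if sign not in (1, -1):
            raise ValueError("pulse amplitudes must be +1 or -1")
        excitation[pos] = float(sign)
    return excitation

def decode_pulses(length, positions, signs, gain, lpc):
    """Gain-scale the pulses, then apply the all-pole filter 1/A(z),
    where A(z) = 1 + sum_k lpc[k-1] * z^-k."""
    scaled = [gain * e for e in ternary_excitation(length, positions, signs)]
    out = []
    for n, e in enumerate(scaled):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                acc -= a * out[n - k]
        out.append(acc)
    return out
```

Because every nonzero sample is +1 or -1, only the pulse positions, their signs and one gain need to be transmitted for each block.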

Any combination of the above techniques can also be used in such a subband coder. It
should also be noted that the subband coding scheme could also be extended to more than two
subbands.

We have described a subband coder where the high-band is not subsampled. The
filtering and sampling rate conversion scheme is relatively simple and has the advantages of
reduced complexity and reduced aliasing problems in the case of unequal subbands. We have
also proposed several high-band coding methods and discussed bandpass random noise
generation, LPC spectral shaping, gain-matched analysis-by-synthesis, and ternary pulse coding.